DOCUMENT RESUME

ED 108 510                                              FL 006 981

AUTHOR          Keller, Howard H.
TITLE           Polysemy and Homonymy: An Investigation of Word Forms
                and Concept Representation.
EDRS PRICE      MF-$0.76 HC-$1.95 PLUS POSTAGE
DESCRIPTORS     *Computational Linguistics; Computer Programs;
                Information Processing; Language Instruction;
                Mathematical Linguistics; Second Language Learning;
                *Semantics; *Vocabulary; *Word Frequency; *Word
                Lists
IDENTIFIERS     Homonyms; *Polysemies

ABSTRACT

Language teaching requires textbook material that
contains the most frequent concepts of a language. The computer
brings its tremendous information processing ability to the task of
establishing word frequency rankings, but the computer is limited to
counting word-forms and not semantic concepts. The most recent word
frequency dictionaries, in fact, exclude parsing and lemmatization
from their data base (Kucera and Francis, 1967; John B. Carroll,
1971). This paper describes the problems involved in adjusting a list
of the 7,000 most frequent English words (word-forms) for
polysemantic variants (e.g., cardinal "bird" vs. "church dignitary")
and for homonyms (e.g., pawn "chess piece" vs. "pledge for a loan").
Polysemy and homonymy present a significant problem in that one
word-form often expresses two or more differing concepts. The
converse of this problem is synonymy: two or more word-forms
expressing one concept (e.g., "freedom", "liberty"). The resolution
of the difference between word-form and concept representation is
important for accurate computerized frequency rankings and for
concept inclusion in various "thousand" frequency groups. These
problems will also be studied in connection with the establishment of
a universal concept list for student review of foreign language
vocabulary. (Author)






Polysemy and Homonymy: An Investigation of Word Forms
and Concept Representation

Howard H. Keller
Department of Foreign Languages
Murray State University
Murray, Kentucky 42071






Keller 1 



The Problem 

For decades the computer has been an invaluable aid in language
research and in the preparation of language teaching materials. The
complexity of grammatical rules and the large amounts of vocabulary
have required automatic data processing routines for efficient handling.
Vocabulary analysis is one major area of language pedagogy that is ideally
suited for computer processing, and yet little work of any great range
or significance has been done in this specialty.

The vocabulary system of a language poses a unique problem to the
student of a foreign language. In order to communicate in a language
a student must master several systems. He must learn some 30 to 45 sounds
and their combinations, phonology, some 50 to 100 grammatical rules (and
their ever-present exceptions!), morphology and syntax, and a set of
vocabulary words that represent all areas of daily life which can be
expected to occur in normal reading or conversation. The number and
complexity of grammatical and phonological rules is significant, but
this number is small in comparison with the size of a vocabulary that
is required for ease in communication in that language.

Studies have indicated that an ability to recognize 7,000 words
is sufficient to cover all areas of daily use, and an ability to use
3,000 words actively is a workable minimum for a person's expressive
needs.2 A statistical study of the 5,000,000 word corpus published in
The American Heritage Word Frequency Book (AHWFB) shows that the most
frequent 7,000 words will occur five times or more in every average
group of 1,000,000 running words.

Since a student must devote two or three years of study to mastering
a language, even this large number of vocabulary items is not a problem
in itself. There is a twofold problem in the efficiency of the process,
however. 1. The student encounters each of the 7,000 words in an unordered,
random manner, and 2. He has no overview of how much he has learned and
how much remains to be learned.

Except for series where related concepts are learned together
(days of the week, months of the year, numbers, etc.), the student encounters
each new word in a language text in a haphazard order. Even first year
texts which claim to present the 1,000 or 2,000 most frequent words of
a language differ widely in the actual lists of vocabulary words that they
present to students. A reading passage on HOUSE, for example, may have
several words on FOOD, BODY, and HEALTH. A passage on MUSIC will probably
not be limited to that topic, but could also include words from the categories
of MIND, FEELING, or COMMUNICATION. It is a rare text that will then give
a review list of vocabulary words arranged according to topics.

It is obvious that it is much easier to learn a word, e.g. Ger. Bucht
'bay', in a list of words that have a common topic, Ger. See, Meer, Fluss,
Bucht, Hafen, Küste, Strand, rather than in a semantically unordered list,
Ger. Buch, Buche, Büchse, Bucht, Buckel. In actual use of the language
the student will see a word in a meaningful context; for this reason he
should also have the benefit of a topical arrangement in his review. It
is vastly more efficient and instructive to run one's eyes down a column
of words that share a common semantic classification. In a list of
unordered foreign words the learner is constantly shifting mental 'gears'
as each new word calls to mind the vivid image of a new object: 'book,
beech tree, tin can, bay, hump' (see the German example above).




This idea is stated in greater detail in the Wortfeld (word field)
theory of Jost Trier and Leo Weisgerber. Their premise is that word
contents or meanings of a form are rarely comprehended in isolation
but rather are influenced and even determined by other words, and that
one word evokes a picture of semantically related words ('field
neighbors') in the consciousness of the speaker or hearer. For example,
in the series BIRD there is a lexical continuum of content, and a person
who reads down the listing 'dove, pigeon, crow, raven, owl' would have
a mental picture of one characteristic of each particular bird together
with the general concept 'bird'. Each new word, 'parrot, ostrich,
peacock, swan, stork', brings not only the characteristic of that bird
with it but also a large element of the entire field as well. A rapid
review of words from unrelated word fields in flash-card sequence
would prove cumbersome due to the continuum of word field associations
that accompany each word.

A student who ultimately hopes to recognize 7,000 words must have
a view of his progress and an overview of the entire system. Many students
who have diligently mastered 1,000 words at a beginning level feel that
they already know a great proportion of the language, and so they are
surprised when they continue to encounter commonly used words in lesson
after lesson. A student becomes confused when he sees no recognizable
end to the learning process, and when he has no way of really knowing
which words he has already learned. For these reasons many students
abandon language studies after one or two years with the idea that it
is impossible to learn a language in a reasonable amount of time. Still
more students never attempt a language because of a mistaken impression
of the awesome number of words that must be memorized as part of an alien
'code'.

The Topical Vocabulary Checklist 

Depressed language enrollments dictate that a solution to this
problem must be found. The key to providing order and system in vocabulary
instruction is the division of 7,000 vocabulary items into manageable
and workable categories and subcategories.

I propose a published checklist of all words which a student is
likely to encounter in his language studies. This list will be divided
into topical categories, and nouns will fill out and define the principal
word groups. Verbs, adjectives, and adverbs will be listed under the
appropriate noun categories. Each word entry will also carry a number
from an authoritative word frequency dictionary indicating the frequency
of that word. For most efficient use the list might be printed in several
versions: a complete list of 7,000 words and a beginning and intermediate
student's version of 2,000 and 4,000 words each. This topical checklist
could be divided into 46 categories of differing lengths (see Table 1),
and the list will be published in English. The checklist will serve as a
type of preprinted notebook with a place for the student to write each
new word as he encounters it either in his language classes or in his
independent reading. There will be an alphabetical index to permit
rapid word location and speedy transfer of large numbers of words from
dictionaries. Since the list is in English it can be used uniformly for
all European languages, and there will be blank spaces at the end of each
topic for words not covered by the lists.






The best source of words for this checklist is Helen S. Eaton, An
English-French-German-Spanish Word Frequency Dictionary. This work is
a composite of frequency studies in four languages, and provides fairly
complete coverage of all topical specialties. Although it has the fault
of age (which will be discussed later), it is a unique work and is widely
available in paperback. It is a dictionary of meaningful words, and not
a listing and counting of logical 'forms'. Its great advantage over
word lists is that every word is parsed (annotated for part of speech),
and lemmatized (separated for polysemy) by virtue of the translations of
each word. The 6,500 words in the main part of the study can be divided
into approximately 3,500 nouns, 1,500 verbs, 1,300 adjectives, and 200
adverbs. Nouns are the easiest to classify because they denote either
concrete objects or easily comprehensible abstract concepts. Verbs,
adjectives, and adverbs are listed with the appropriate noun category.

Creating a Concept List from Dictionary Word Listings

The computer is the ideal device for handling and classifying the
6,500 words in Eaton and ordering them into a complete topical vocabulary
checklist. I would like to describe some of the problems in setting up
this list, and I will outline the role of the computer in handling these
problems.

The difficulties concerned with imposing workable categories on all
useful words of daily life center around the general areas of word
classification, word ordering, and word location. The computer was also helpful
with the additional tasks of writing different versions of the list, indexing
all formats of the list, and writing cross-references for the words or word
forms that appear more than once. Frequency notation for each word is also
a problem, and the computer also permits simultaneous use of several
frequency annotations obtained from different sources.

The main considerations and procedures are the following:

1. Parsing, or the listing of each part of speech separately.

2. Establishing the topics and subtopics necessary to accommodate a
corpus of 7,000 words.

3. Establishing a logical order of words within each topic and
subtopic.

4. Assigning each word to an appropriate topic.

5. Polysemy: dividing one word with multiple meanings into several
concepts, each with one specific meaning.

6. Synonymy: placing two different word forms of similar meaning
under one concept listing.

7. Establishing concept listings with English words that will still
permit the listing of foreign words with different definition extensions.

1. Parts of speech must indeed be listed separately. A one letter code
can be used to mark words for their part of speech (N, V, A, D for noun,
verb, adjective, and adverb), and the sort on these letters is always done
first. It will be shown later during a consideration of computerized lists
that a great many word forms in English can occur simultaneously as nouns,
verbs, or adjectives.
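
The sort described in item 1 can be sketched as follows. This is only an illustration in Python, not the keypunch procedure used in the study: the sample entries, the `POS_ORDER` table, and the function name are assumptions made for the example. Only the letter codes N, V, A, D come from the text.

```python
# Hypothetical sample entries: (word-form, one-letter part-of-speech code).
# The codes N, V, A, D follow the text; the words are illustrative only.
entries = [
    ("run", "V"),
    ("quick", "A"),
    ("run", "N"),
    ("quickly", "D"),
    ("book", "N"),
]

# Assumed ordering of the part-of-speech codes, used as the first sort key.
POS_ORDER = {"N": 0, "V": 1, "A": 2, "D": 3}


def sort_by_part_of_speech(items):
    """Sort on the part-of-speech code first, then alphabetically by form."""
    return sorted(items, key=lambda entry: (POS_ORDER[entry[1]], entry[0]))


sorted_entries = sort_by_part_of_speech(entries)
for word, pos in sorted_entries:
    print(pos, word)
```

Note that the form 'run' appears twice, once as a noun and once as a verb, which is the situation the paragraph above anticipates for many English word forms.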

2. A series of topics and subtopics must be established to provide a
logical and useful division of reality. Philosophers and scientists have
proposed and reworked divisions for reality since long before the advent of
the categories of Aristotle, and this process of classification has continued
to the present day. The three criteria of an acceptable vocabulary
organization are that it be complete, that the groupings be plausible and
logical, and that the 7,000 basic words of a language inform the various
categories and subcategories in a balanced manner. A glance at Table 1 gives
an indication of the reasonableness of its subdivisions, and a view of the
finished product will determine if all 7,000 words have been assigned in
even proportions.

3. Every word in the study must be part of a logical order under its
appropriate topic. Many arbitrary decisions must be made in establishing
this order, since there is obviously no self-evident way of classifying
reality. It is wise to establish a limit to the number of words in a
category (40 has proven to be a convenient number). Since subtopics will
usually contain fewer than 40 words, it becomes easy for the user of the
list to locate the word he is looking for once he has found the correct
topic under which it is classified. Please see Table 2 for an illustration
of a sample topic. In examining these lists the advantages of

Each of the 7,000 words in the study has been keypunched on an 80
column data card. This is sufficient to allow a full statement of a 'concept'
of several synonyms, the punching of several frequency annotations, and
an eight place topic and order code. All cards were rearranged to establish
category location and an internal logical order for each category. A code
number was then given to each word so that a sort on this number would print
the lists out in the desired logical order. A three digit number was
sufficient [...] any two consecutive words in the list without changing the
code numbers of all remaining words. A decimal point was inserted as a
boundary between the two digit category code (01 to 46) and the five digit
subcategory code.

Subcategory divisions and headings are indicated by a change in the
first digit following the decimal point. Thus, 10.100 to 10.199 is
assigned to TREE, 10.200 to 10.299 is for PLANT, and 10.300 to 10.399
is for FLOWER. The printing routine causes a line to be skipped each time
the index number indicates a new subcategory. This ensures a convenient
visual separation within each topic. The additional two digits do not
print out unless a word has been inserted between two previously coded
words, but these digits give the list the potential of great flexibility.
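
The coding scheme can be illustrated with a short Python sketch. It assumes the layout as I read it from the text: a two digit category, a decimal point, and an order field of up to five digits whose first digit marks the subcategory. The sample words, the codes below the subcategory level, and all function names are hypothetical. The original work sorted punched cards, so this is an analogy rather than a reconstruction of that routine.

```python
# Hypothetical coded entries: (topic-and-order code, concept).
# Only the ranges 10.1xx TREE, 10.2xx PLANT, 10.3xx FLOWER come from the text.
coded = [
    ("10.200", "plant"),
    ("10.100", "tree"),
    ("10.110", "oak"),
    ("10.300", "flower"),
    ("10.11050", "acorn"),  # inserted later, between 10.110 and 10.120
    ("10.120", "pine"),
]


def sort_key(code):
    """Pad the order field to five digits so inserted codes sort in place."""
    category, order = code.split(".")
    return (category, order.ljust(5, "0"))


def subcategory(code):
    """Return the category plus the first digit after the decimal point."""
    category, order = code.split(".")
    return (category, order[0])


def format_list(items):
    """Return printable lines, skipping a line at each new subcategory."""
    lines = []
    previous = None
    for code, concept in sorted(items, key=lambda entry: sort_key(entry[0])):
        current = subcategory(code)
        if previous is not None and current != previous:
            lines.append("")  # the skipped line marks a new subcategory
        lines.append(f"{code}  {concept}")
        previous = current
    return lines


listing = format_list(coded)
print("\n".join(listing))
```

Because 'acorn' carries the two extra digits, it lands between 'oak' and 'pine' without any other code being renumbered, and only that entry prints with five digits after the decimal point.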

4. Many words have presented problems in assigning an
unambiguous category. A large number of words can indeed fit with equal
justification under two or more categories. A glance through the following
examples gives some idea of the problem: doctor: PROFESSION or
HOSPITAL; chicken, turkey: ANIMAL (06), BIRD (07), or FOOD;
lamb: ANIMAL (06) or FOOD; invention, discovery: TECHNIQUE or
[...]; theory: MIND/INTELLECT or SCIENCE. Abstract nouns
such as building (act of), return, fall (tumble), fact, advantage
[...] generated index is of great value in locating words that do not have
an obvious location. It is indeed tempting to list a word two or three
times if it fits two or three categories, but this destroys the one-to-one
correspondence between the topical vocabulary checklist and the source
lists. It will be shown later that multiple listings of single word-forms
will be required when one word-form expresses several meanings or concepts.

5. The phenomenon of polysemy is probably the greatest obstacle in
what ought otherwise be a mechanical process of taking a finite list of
7,000 words and simply rearranging them according to semantic similarities.

Polysemy is defined as the fact that a word-form may have more than
one meaning or may designate more than one object or concept in the world
outside language. An extensive example is furnished by the word-form
'head', which can designate objects or concepts as diverse as: head of a
body; mind (a good head); a drug head (user); the obverse (heads) side of a
coin; an individual within a group (count heads); an animal within a herd
(head of cattle); a boss, leading chief, or director (head) of a department;
the pressure of liquid or vapor (head of steam); the foam on top of an
effervescent liquid (head of beer); the tip (head) of an abscess, boil, or
pimple; a turning point or crisis (come to a head); the head of a screw, of
a bolt, of a pin; the head of a hammer; the recording head of a tape deck;
a head of [...]

[...] different figurative concepts can be derived from the literal meaning
of 'head: topmost part or most important part of a larger body.' These
are not predictable from the literal concept 'top of a body.' We see proof
of this in the fact that various foreign languages translate most of the
figurative extensions of 'head' with a variety of word forms that use
different figurative analogies and are formed from different roots.

It also becomes apparent from reading through a full list of definitions
for 'head' that it is impossible to establish a precise number of meanings
or draw well defined boundaries between the various related concepts.
Many technical meanings occur in very limited environments, and once the
connection with the context is established, the analogy becomes
evident. Is the head of a bed qualitatively different from the head of a
door or window? A hammer and a pin are very different tools, and yet the
concept 'head' [...]







[...] more words that have the same sound and often the same spelling
[...]. A semantic link is always present in polysemy but [...]; 'pawn' in
chess and 'pawn' a pledge for a loan are a homonymous pair since they do not
share a common etymology. It is especially true in examples of wide
range polysemy that the semantic link may not be evident at first
('cardinal': bird vs. church dignitary), but the "bridge" either becomes
apparent after some examination or it can be established in a good
reference dictionary.

The computer offers useful assistance at this point, because a
topical vocabulary checklist must be a compilation of concepts, not of
word-forms. The fact that 'cardinal' is listed under RELIGION in the order
'clergy, pastor, priest, bishop, archbishop, cardinal, pope' must not
exclude a listing under BIRD in the order 'oriole, blackbird, cardinal,
finch, sparrow.' The computer enables the compiler to keep track of the
source words with multiple meanings after these word-forms are listed in
separate categories, and the computer also permits the compilation of a
separate list of polysemantic word-forms in the appendix to the checklist.
This list is of interest to language students because it shows how
a given language makes ample use of figure and metaphor to supply linguistic
symbols or "code names" for the [...] that
exist outside that language. If the word equivalents in a target foreign
language are also keypunched and are matched with their equivalents in
the computerized English corpus, and if word-forms in the foreign language
are also coded for multiple meanings, the computer can repeat the process
of compiling an appendix of polysemantic words in the target language.
Language students of all levels can then gain an insight into the process
of concept-representation and concept-adaptation in the target language,
and the students will see how a speech community exploits its native
word stock to cope with the communications needs of a changing culture.
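
A minimal sketch of how such an appendix might be compiled is given below in Python. It is not the procedure used in the study. The sample entries and the function name are assumptions, and the topic name GAME is hypothetical; only the RELIGION and BIRD listings of 'cardinal' are taken from the text.

```python
from collections import defaultdict

# Hypothetical concept entries: (word-form, topic under which it is listed).
concept_entries = [
    ("cardinal", "RELIGION"),
    ("bishop", "RELIGION"),
    ("cardinal", "BIRD"),
    ("sparrow", "BIRD"),
    ("bishop", "GAME"),  # chess piece; the topic name is an assumption
]


def polysemantic_appendix(entries):
    """Map each word-form listed under more than one topic to its topics."""
    topics_by_form = defaultdict(list)
    for form, topic in entries:
        if topic not in topics_by_form[form]:
            topics_by_form[form].append(topic)
    return {
        form: topics
        for form, topics in sorted(topics_by_form.items())
        if len(topics) > 1
    }


appendix = polysemantic_appendix(concept_entries)
for form, topics in appendix.items():
    print(form, ":", ", ".join(topics))
```

The same function could be run on a keypunched target-language list, provided its word-forms are coded for multiple meanings in the same way.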

The compiler of a concept list must use judgment in selecting
the most common denotations of words, those that are relevant to daily
use, and he must bypass highly specialized or technical denotations.
In most cases a foreign language will translate the various concepts
of a specific word-form with different words, and so one can examine the
frequencies of these words in the foreign language to determine the
value of the extended or technical meanings.

Table 3 gives a small indication of the forms that required
multiple listings because of polysemantic meaning diversity. The most
general trend concerns the grouping of one concrete image and several
additional abstract or intangible denotations: e.g. 'way': road and
method; 'summit': peak of a mountain and any point or climax; 'genius':
the person and the [...]. Other groupings concern
a person or thing and [...]: 'beggar': a penniless person and
a person who begs; 'dwelling': a dwelling and the act or a period of
living somewhere; 'building': a [...] for shelter and the act of
constructing; 'painting': a portrait or painted picture and the act of
painting anything; 'entrance': the door itself and the act of going into
something. A limited number of polysemantic pairs also arise from the
need for technical terms that are restricted in use in a particular
specialty. Chess and card-playing terminology provide a good example in
the series of meanings of the word-forms: 'king, queen, knight, bishop'
or 'king, queen, jack, spade, heart, diamond, club.' These two specialties
have in turn added some of their own technical terms to the general language
use: 'pawn': a person used as an object by another person; 'ace': a person
who is an expert in his field.

Any attempt to write down one language's categorization of reality
must take into account the fact that each language is in a constant state
of change. Every language is continually adding new words to its vocabulary
inventory, and is continually changing the meaning of older words. It is
a delight to see this process at work, because languages prefer to give
a form to a new object or concept from an existing root by using analogy,
poetic description, or some other figurative device. It is unusual for a
language to build new word-forms from previously unused sound combinations
in the manner of Jabberwocky. After the language user becomes aware of this
process, he can note how a language not only supplies sounds and sound
combinations for extralinguistic objects, but also makes its own descriptive
statements about these objects and their place in the world. We are so
accustomed to speaking our native language that we no longer notice the
underlying metaphor in thousands of words. One of the rich experiences
in learning a foreign language is the gaining of a fresh view of this
vigorous process by using a set of entirely different meaningful word-forms
and word roots.

To summarize the problems of word listing, topic location, and
polysemy: where a word-concept can occur under two or more categories,
a decision must be made for one category, since only one concept is
involved and only one listing is possible. Where one word-form denotes
two or more basic concepts, one must place the several concepts under
their appropriate topics, and not restrict the listing to the one
word-form.

6. The converse to the problem of polysemy (one form and several
concepts) is the problem of synonymy (several forms that express only
one concept). John Lyons and many other authors point out that no two
synonyms have exactly identical (coterminous) meaning extensions nor
are mutually substitutable in every context or meaning environment.
Yet English is a language that is particularly rich in synonyms, and
European languages often have only one word to translate many close
synonym pairs in English. In order to establish an economical concept
list that will not involve repetitious listing of foreign words to
accommodate series of English synonyms, many decisions must be made in
word formats. For example, the synonym pair 'freedom, liberty' could
be listed as one entry separated by a comma, or the two words could be
listed as two discrete entries, 'freedom' and then 'liberty'. Automatic
data processing procedures are useful in resolving this problem, since
they permit the indexing of all words that appear not as a main entry
but as a synonym after a comma. A special code symbol is enough to mark
these words as secondary synonyms.
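
The indexing of secondary synonyms might look like the following Python sketch. The asterisk used as the code symbol, the topic name SOCIETY, the sample concept list, and the function name are all assumptions for illustration. The text says only that a special symbol marks words that follow the comma.

```python
# Assumed code symbol for a secondary synonym; the paper does not name one.
SECONDARY_MARK = "*"

# Hypothetical concept list: (entry as printed, topic).
# 'freedom, liberty' is the paper's example; SOCIETY is an assumed topic.
concept_list = [
    ("freedom, liberty", "SOCIETY"),
    ("pigeon, dove", "BIRD"),
    ("sparrow", "BIRD"),
]


def build_index(concepts):
    """Return an alphabetical index that also covers words after a comma."""
    index = []
    for entry, topic in concepts:
        forms = [form.strip() for form in entry.split(",")]
        main = forms[0]
        index.append((main, topic, ""))
        for secondary in forms[1:]:
            # Mark the secondary synonym and point back to its main entry.
            index.append((secondary + SECONDARY_MARK, topic, f"see {main}"))
    return sorted(index)


index = build_index(concept_list)
for form, topic, note in index:
    print(form, topic, note)
```

With this arrangement the concept is listed once under its main entry, yet a student who looks up 'liberty' or 'dove' in the index is still directed to the right place.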

Table 4 gives a short indication of some of the more common synonym
pairs. A glance at this list will indicate that the degree of difference
in some pairs is greater than in other pairs. In many pairs the difference
is a technical one, and in loose speech one member of the pair might be
substituted for the other member, even though there is indeed a difference:
e.g. tortoise and turtle, hare and rabbit, alligator and crocodile, etc.
Since the goal of the vocabulary checklist is to classify the more basic
concepts of a foreign language that a student might encounter, even a
loose grouping of synonyms (where each member of the pair has an additional
meaning of its own) is permissible.

The only complication created by synonym pairs is that it is difficult
to transpose a frequency count of an individual form to that of an individual
meaning-concept. If we wish to establish the frequency rank of the concept
'pigeon/dove' where the form 'pigeon' has a frequency-per-million (FPM)
count of 7.1511 and the form 'dove' an FPM of 4.1270, we must decide
whether to use the higher individual FPM (of 'pigeon'), giving us a ranking
in the 6,300 frequency group, or to add the values of FPM for the two forms
'pigeon' and 'dove', giving us a total FPM of 11.2781 for the concept
'pigeon/dove' and a consequent ranking in the 4,800 frequency group.
For many words or concepts the total FPM becomes important in establishing
criteria of inclusion in frequency rankings (thousand groups) and in lists
defined by these frequency rankings. If our list is limited to the first
7,000 words of a language, neither the form 'poultry' nor the form 'fowl'
would merit inclusion due to their low FPM and consequent low rankings
('poultry' 4.7946, 8,000 group, and 'fowl' 3.1764, 9,900 group). The
single concept 'poultry/fowl' would merit inclusion, however, because the
total FPM of the two word-forms, 7.9710, places it in a ranking with the
5,900 word group.
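
The arithmetic behind the two choices can be checked with a few lines of Python. The FPM values are those quoted in the paragraph above, as I read them from the text. The function name and the `method` parameter are assumptions, and the sketch does not reproduce the rank groups, which depend on the full frequency list.

```python
from decimal import Decimal

# FPM values as quoted in the text (frequency per million running words).
fpm = {
    "pigeon": Decimal("7.1511"),
    "dove": Decimal("4.1270"),
    "poultry": Decimal("4.7946"),
    "fowl": Decimal("3.1764"),
}


def concept_fpm(forms, method="sum"):
    """Combine the FPM of several word-forms into one concept value.

    method="sum" adds the values; method="max" keeps the highest one.
    """
    values = [fpm[form] for form in forms]
    if method == "sum":
        return sum(values, Decimal("0"))
    return max(values)


pigeon_dove_sum = concept_fpm(["pigeon", "dove"])
pigeon_dove_max = concept_fpm(["pigeon", "dove"], method="max")
poultry_fowl_sum = concept_fpm(["poultry", "fowl"])

print(pigeon_dove_sum, pigeon_dove_max, poultry_fowl_sum)
```

Summing treats every occurrence of either form as an occurrence of the concept, which is why 'poultry/fowl' rises above both of its members; taking the maximum keeps the concept at the rank of its more frequent form.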

7. The last major problem in establishing a concept-oriented
topical vocabulary checklist arises from the fact that each language
has differing extensions in the denotations of its basic word-concepts.
Some graphic examples of this are Rus. ruka for both 'hand' and 'arm';
Rus. noga for both 'foot' and 'leg'; Fr. doigt for both 'finger' and
'toe'. Very often the sum of a set of words from two different cultures
will have the same total extension or denotation, but the subdivisions
of the set will have different proportions in each language. For example,
many European and American cultures have three mealtimes in a day, but
the definition of the individual members of the set
'breakfast-lunch-dinner-supper' varies extensively in terms of size of the
meal, preparation (hot or cold), and time of serving. All European cultures
use a 24-hour day, and yet the times and lengths of the following
subdivisions vary from language to language: morning, afternoon, evening,
night. In fact, Russian does not have a commonly accepted form for
'afternoon', and while French and German do have a word for 'afternoon'
(après-midi, Nachmittag), they have no equivalent for the greeting 'Good
Afternoon'. In Spanish, one word-form is often used for both 'afternoon'
and 'evening' (tarde), and the word-form noche can be used for both
'evening' and 'night'.

D. A. Wilkins in his interesting work, Linguistics in Language
Teaching, exaggerates the problem when he says: "The physical world does
not consist in classes of things, nor are there universal concepts for
each of which every language has its own sets of labels. Language learning,
therefore, cannot be just a matter of learning to substitute a new set
of labels for the familiar ones of the mother tongue. It is not difficult
to find a word of equivalent meaning in a given linguistic and social
context. It is most unlikely that the same word would prove equivalent
in all contexts. Every language classifies physical reality in its own
way."

There is indeed a good deal of truth in this statement, but one must
not interpret it to mean that it is impossible to translate language
A into language B because every concept in A is not coextensive with every
concept in B, and every word-form in A does not have an exact equivalent in
B. Even though Wilkins's statement is basically true, we can still translate
between two different languages with a fair degree of approximation, and we
can also set up concept surveys of extralinguistic reality which any two
European languages might reasonably be expected to express.

If the English words selected to represent concept groups are kept
general, and if the user of the list is prepared to accept blank spaces for
some concepts (Rus. 'afternoon', Ger. 'efficient, frustrated') and the
necessity to enter several foreign words for other single concepts ('cousin':
Ger. der Vetter, die Cousine; 'return': Fr. revenir, retourner, rentrer,
renvoyer; 'box (container)': Ger. die Schachtel, die Schatulle, der Kasten,
die Kiste, das Kästchen, das Etui, das Futteral, der Karton, der Koffer,
die Dose, die Büchse), the user will see that the English word-forms for
these concepts are neither absolute nor universal.






All useful sets must appear in the topical list, but the component
members of these sets should be stated in a general manner, since foreign
languages rarely have equivalents for all members of a set of objects.
An example of such a series or set is ROAD, where not every foreign culture
can match each individual member of the set: 'path, alley, lane, way,
street, road, avenue, boulevard, highway, expressway/freeway/interstate'.
Different cultures have differing subdivisions for COAT/JACKET, as suggested
by the English concept-divisions: 'overcoat, topcoat, frock coat, coat,
three-quarter length coat (car coat), field jacket, lumber jacket,
windbreaker/anorak, suit jacket, suit coat, C.P.O. shirt,' etc. This concept
list is overly specific. A series list is more useful when it has more
general subdivisions ('overcoat, coat, field jacket, suit coat'). It is also
helpful if the user is made aware of the value of 'cultural' correspondences
or equivalents, rather than making exact translations.

The Advantages and Limitations of English Word-Forms for Universal Concepts 

There is a great advantage in writing the concept list in one language
only. The student receives active practice in using the list by writing
down each new foreign word as he learns it. This is much more useful than
the student's running his eyes down an already prepared list of equivalents
in both his native and foreign languages. The concept of the topical vocabulary
checklist as a preprinted notebook was mentioned earlier. An additional use
of an "English only" list is for students who are learning more than one
foreign language. If that student has two blank copies of the checklist,
he has a very graphic device for measuring and matching his knowledge in
both languages, and he can transfer those insights gained from studying
his first foreign language into his second foreign language. When these
two foreign languages are closely related (e.g. French and Italian,
Spanish and Portuguese, German and Danish, Russian and Polish), the
user of the lists can see the great similarity of vocabulary, and can
build his knowledge of the second language on his knowledge of the first.

The obvious disadvantage of an "English only" concept list is
that it will not have a specific entry for the many culturally unique 
aspects (food, clothing, etc.) of a foreign culture. Here again, judicious
amounts of blank space after each subheading will permit the student to
catalog culturally unique words such as Sp. tamale, taco, enchilada,
chili; Rus. vodka, borsc, sci, kolbasy, zakuski, etc. No concept list
or implied division of reality could ever pretend to be complete. Blank
spaces give the checklist another dimension of flexibility, and the
student is also free to "personalize" his list by expanding those areas
where he has a specialized interest in technical words that do not appear
in the first 7,000 word frequency range.

A good topical vocabulary checklist will stand or fail on the
strength of its ability to classify, order, and list words. The main section
of this article has shown the seven problem areas where ADP routines can
help to achieve a balanced list. The last consideration for such a list
is the frequency notation of the words themselves.




The Need for Frequency Notations 

The list is based on the 7,000 most frequent words of four languages,
but it is still important that each word have an individual frequency
notation. This is a great help to the student in that it eases the
arduous process of learning and reviewing large numbers of foreign
vocabulary words. The shorter versions of the list mentioned earlier
(2,000 words and 4,000 words) are a morale booster in that they establish
intermediate stages of proficiency, subgoals that give the student a sense
of accomplishment when he masters them. If a beginning language student
sees the full array of 7,000 language concepts that remain to be learned,
he could easily be overwhelmed by this vast mass of words, and could
decide to discontinue further language study.

The computer permits rapid establishment of intermediate listings
under any frequency criteria that a compiler or publisher might desire.
The attendant task of creating the various indexes for each intermediate
version is also made easy by computer routines.

One major reason for any topical listing is to reduce large categories
to manageable levels. If each word has its own frequency "identity", the
student can set intermediate goals on his own even if he is using the full
7,000 word version of the list. He can include or exclude subcategories
according to his interest in them or his need for them, and he will no
longer be overawed by vast quantities of words that he must learn. Even
though the student's intuition or common sense will tell him that some words
are more important than others, the student still needs frequency indications
in the majority of topics to gain perspective concerning the importance of
large numbers of vocabulary words. It will be obvious to the student
that the names of berries, grains, trees, and birds are not of high
frequency, and that 'eyelid, eyebrow, temple, nostril, knuckle, knee,
ankle' are definitely less important than 'head, eye, nose, finger,
leg, foot.' It will be very difficult for the student, however,
to determine which abstract nouns are more important than others,
and which verbs and adjectives should be learned before others.

A vocabulary notation next to each word gives the student an
additional dimension of choice in learning words. He can establish
different limits in each topic according to which thousand group he
wishes to include. He can learn all words in categories of special
interest to him, and he can omit, for example, all words beyond the
first or second thousand group in other categories. If he must stop
his language studies at any time, he has an exact record of the categories
and frequency limits he has studied, and he will not need to duplicate
these words when he resumes his studies at a later date.
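
The idea of a per-topic frequency limit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the notations are a subset of those in the WEATHER sample topic (Table 2), with 'cold, coldness' and 'heat, warmth' shortened to one word each, and the function name `within_limit` and the cutoff of 2.0 are hypothetical choices, not part of the checklist itself.

```python
# Thousand-group notations for some words of the WEATHER sample topic (Table 2).
# The dictionary layout and the helper below are illustrative assumptions.
WEATHER = {
    "weather": 1.4, "cold": 1.9, "heat": 1.5, "rain": 2.0, "snow": 2.0,
    "ice": 2.4, "wind": 1.4, "storm": 1.6, "fog": 3.1, "dew": 3.9,
    "rainbow": 5.8, "thermometer": 6.6, "snowflake": 7.8,
}


def within_limit(topic, limit):
    """Return the words whose notation is at or below `limit`, most frequent first."""
    chosen = [(notation, word) for word, notation in topic.items() if notation <= limit]
    # Sorting the (notation, word) pairs orders by notation, then alphabetically.
    return [word for notation, word in sorted(chosen)]


# A student who stops at the second thousand group in this topic:
first_two_thousand = within_limit(WEATHER, 2.0)
print(first_two_thousand)
```

A different limit can be passed for each topic, which mirrors the choice described above: a high limit for categories of special interest, a low one elsewhere.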

Source of Frequency Notations 

As mentioned earlier, the use of Eaton's four-language frequency
dictionary as a source list brings the advantage of giving the 6,500
most common words of these languages together with a frequency annotation
for each word. The original date of the work is 1940, and it in turn is
based on four monolingual frequency dictionaries that date from 1932,
1929, 1927, and Kaeding's monumental word count of 1898. Thus, many
words in Eaton are "dated", and there are many modern words which are
not included.



[...] more than compensate for its age: 1. every word is marked for part of
speech; 2. every word is implicitly lemmatized into its [...] (most
frequent) concepts; and 3. every word is listed with an equivalent in
three other languages.

The two most useful modern word studies have been the Kucera
and Francis count of the Brown University Corpus (BUC) and John B.
Carroll's count of the American Heritage Intermediate Corpus (AHIC).
The BUC is a computer analysis of 1,000,000 running words of all
specialized topics (sciences, humanities, literature, etc.) and areas of
difficulty, and was keypunched in 1963-64. The AHIC is a computer
count of 5,000,000 running words of graded school reading material from
grades 3 to 9. The AHIC was designed to produce a citation base for
The American Heritage School Dictionary, and has also been used in the
preparation of vocabulary questions on college entrance examinations.

It must be emphasized that these two studies are not word counts
but rather counts of logical forms. The studies define a 'word' as any
string of graphic characters preceded and followed by a space. This
logic, while necessary for computational analysis, renders the BUC and
the AHIC of little use in setting up frequency numbers for meaningful
concepts.
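
The space-delimited definition of a 'word' is easy to reproduce, which makes its limitation concrete. The following Python sketch is an assumption-laden illustration, not the procedure used for either corpus: the function name, the sample sentence, and the decision to strip surrounding punctuation are all invented for the example.

```python
import string
from collections import Counter


def count_graphic_forms(text):
    """Count each space-delimited string (minus surrounding punctuation) as its own form."""
    tokens = (token.strip(string.punctuation) for token in text.split())
    return Counter(token for token in tokens if token)


sample = "Words matter. A word is a word, and words are worded with care."
forms = count_graphic_forms(sample)
print(forms["word"], forms["Words"], forms["words"], forms["worded"])
```

The counter treats 'word', 'Words', 'words', and 'worded' as four unrelated entries, and 'A' and 'a' as two more, so a form count of this kind says nothing directly about how often the underlying concept occurs.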

The lack of parsing (indication of part of speech) nullifies the
value of both studies for word analysis. All noun plurals, noun possessives,
3rd sg. verb forms, and past tense or past participle verb forms are counted
separately from their non-inflected noun and verb forms. An additional
complication is that all nouns, verbs, and adjectives with the same
graphic form [examples illegible] are listed
as one form only, with no record of the part of speech used. [...]
range that have identical forms for either noun/verb (e.g. 'war, use,
work, look, talk, help'), noun/adjective (e.g. 'drunk, cold, noble,
concrete, secret, magic'), or verb/adjective (e.g. 'dull, blunt, smooth,
equal, blind, clean'). To cite the word 'word' as an example, the AHWFB
lists seven graphic forms: 'word, Word, word's, worded, wording, words,
Words'. In order to find the total frequency of the concept 'word',
one must add all individual frequencies of the seven forms. Even here
there is still no accurate indication of the relative values of 'word'
NOUN and 'word' VERB.
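
The summing step described for 'word' can be sketched as follows. The seven graphic forms are those named in the text, but the frequencies attached to them are hypothetical placeholders (the actual AHWFB figures are not quoted here), and the form-to-headword mapping is a hand-made assumption, since neither corpus supplies one.

```python
# Hypothetical counts for the seven graphic forms of 'word'; not AHWFB data.
form_counts = {
    "word": 900, "Word": 40, "word's": 5, "worded": 3,
    "wording": 7, "words": 850, "Words": 60,
}

# Hand-made mapping from graphic form to headword (an assumption for illustration).
lemma_of = {form: "word" for form in form_counts}


def concept_totals(counts, lemma_map):
    """Add the frequencies of all graphic forms that share one headword."""
    totals = {}
    for form, n in counts.items():
        lemma = lemma_map.get(form, form)  # unmapped forms stand for themselves
        totals[lemma] = totals.get(lemma, 0) + n
    return totals


totals = concept_totals(form_counts, lemma_of)
print(totals)
```

Even after this merge the total mixes noun and verb uses; separating them would require part-of-speech information that a count of graphic forms does not contain.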

The BUC and the AHIC can provide a real service for the topical
vocabulary checklist, however, by providing a "backup" set of frequencies
for the words listed in Eaton. A check of the first 7,000 BUC and AHIC
listings will at least indicate which "modern" words are missing in
Eaton, and which words have gone out of frequent use since the compilation
of the Eaton source frequency dictionaries. The BUC and AHIC frequency
notations would not need to be published with the topical vocabulary
checklist, since they would be of interest only to the language researcher.
A language scholar would indeed find fascination in an examination of
the shifting values of the most frequent 7,000 words of a language in a
period of 40 years.
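
The "backup" check amounts to comparing two word lists in both directions. A minimal Python sketch, assuming each list has already been reduced to comparable headwords; the function name and both tiny sample lists are invented for illustration and are not drawn from Eaton, the BUC, or the AHIC.

```python
def compare_lists(older, newer):
    """Report words found in only one of two frequency lists."""
    older, newer = set(older), set(newer)
    return {
        "missing_from_older": sorted(newer - older),  # candidate "modern" words
        "dropped_from_newer": sorted(older - newer),  # words gone out of frequent use
    }


# Invented sample data standing in for the first 7,000 entries of each list.
eaton_sample = ["carriage", "horse", "telegraph", "weather"]
modern_sample = ["horse", "television", "weather", "computer"]

report = compare_lists(eaton_sample, modern_sample)
print(report)
```

In practice the comparison is only as good as the matching of forms to headwords, since the modern counts list graphic forms rather than concepts.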



[...] in the establishment of the concept list for the topical vocabulary
checklist: [...] word and category reshuffling until the word arrangement
"feels right", and until the list shows a high degree of internal consistency.

Even though it is impossible to classify and divide reality into
a fixed set of universal categories and subcategories, the attempt to
do so produces a list that has great merit as a teaching device. Even
the discrepancies between the English list and the target language
serve to give the student an insight into how both languages order and
express reality.

As the student uses the list, he will gain respect for its
completeness, and he will find that the vast complexity of reality is
much more easily comprehended in a foreign language if the several
thousand word forms are arranged into workable categories. Even
if the student encounters new foreign vocabulary items in unordered
or random sequence, he can write them down under ordered topics
and he can compare them to [...] related words in the [...]
topics that he already [...].

[...] the student a visible end to [...] mastering a foreign language,
and it permits the student to register his progress in attaining mastery
in that at all times [...] has a clear view of how much
he knows and how much remains to be learned. This perspective is essential,
since otherwise the learning process would be an exercise in learning
7,000 unordered words.

The topical checklist is a useful device for the many necessary
vocabulary reviews that are associated with language learning. As a
student checks off words that he knows in a category, he receives
encouragement to learn the remaining words, especially when the proportions
indicate that he has already learned 70% or 80% of the words in a given
topic.
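
The proportion that the student sees for a topic is a simple ratio of checked-off words to listed words. The sketch below assumes a topic is a list of words and the student's record is a set of known words; the function name and the sample record are hypothetical.

```python
def topic_progress(topic_words, known_words):
    """Return (known, total, percent) for one topic of the checklist."""
    if not topic_words:
        raise ValueError("topic has no words")
    known = sum(1 for word in topic_words if word in known_words)
    return known, len(topic_words), round(100 * known / len(topic_words))


# Ten words from the WEATHER topic and an invented record of what a student knows.
weather = ["weather", "climate", "rain", "snow", "ice",
           "wind", "storm", "fog", "dew", "frost"]
student_knows = {"weather", "rain", "snow", "ice", "wind", "storm", "fog"}

progress = topic_progress(weather, student_knows)
print(progress)
```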

If a topical vocabulary checklist is made a part of the curriculum
of a language course, the student will receive an added awareness of
his mastery of the language: he will know what he knows, and he will
know what he doesn't know. A student with a feeling for his own
vocabulary strengths and weaknesses will be much better able to participate
in speaking and conversation exercises, since he will know those areas
where he has a detailed vocabulary competency, and those areas where he
must paraphrase his thoughts in order to be understood.

Awareness of his progress in language is a valuable encouragement
to students of any level in any language course. Today's declining
language enrollments place a great deal of value in encouraging students
in the face of the complexity of a foreign language.



[...] grant will finance the computer work necessary to prepare a topical
vocabulary checklist. I wish to thank the committee and its chairman,
Dr. Kenneth Harrell, for the faith they have shown in this project.

Robert Lado, Language Teaching: A Scientific Approach (New
York: McGraw-Hill, 1964), p. 117, gives these figures as a result of
examining various general service word lists.

John B. Carroll et al., The American Heritage Word Frequency
Book (New York: American Heritage Publishing Co., 1971).

This is due to great differences in frequency lists which in
turn are compiled with differing criteria from different types of
source texts. A study of the five leading Russian textbooks and the
three accepted word frequency lists showed a great variation of inclusion
and exclusion (Nicholas P. Vakar, "Statistical Methods in the Analysis
of Russian," Slavic and East European Journal, 11 (1967), 59-[...]).

See the entry "Feld" in Werner Welte, Moderne Linguistik:
Terminologie/Bibliographie (München: Max Hueber Verlag, 1974),
for a convenient [...] of the word field theory and for a useful list
of associated literature.



Helen S. Eaton, An English-French-German-Spanish Word Frequency
Dictionary (New York: Dover Publications, 1961). This work is a reprint
of an earlier edition entitled Semantic Frequency List for English,
French, German, and Spanish (Chicago: University of Chicago Press, 1940).
This dictionary establishes the frequency ranking of a word by "thousand
groups." Eaton averages the frequency annotation for each word in each
of four languages. Thus, the notation 1.0 designates a word that occurs
in the first thousand group in all four languages. A notation of 2.6
indicates an average frequency ranking of 2,600 in all four source
dictionaries; 3.9 indicates 3,900, and so on.

The actual number of word-concepts in Eaton is 6,473, but I use
the number 7,000 in various sections of this article because it is a
convenient round number and is approximated by additional proper nouns
listed in the appendix of the Eaton dictionary.
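
The "thousand group" notation described in this note can be illustrated with a short calculation. The sketch assumes that the notation is the mean of a word's rankings in the four source dictionaries, expressed in thousands to one decimal place; that reading follows the description above, but the function name and the four sample rankings are hypothetical and are not taken from Eaton.

```python
def eaton_notation(rankings):
    """Average the per-language rankings and express the mean in thousands."""
    mean_rank = sum(rankings) / len(rankings)
    return round(mean_rank / 1000, 1)


# Invented rankings of one concept in the four source dictionaries.
sample_rankings = [2400, 2500, 2700, 2800]
notation = eaton_notation(sample_rankings)
print(notation)
```

With these sample values the mean ranking is 2,600, which corresponds to the notation 2.6 used as an example in the note.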

W. F. Mackey, Language Teaching Analysis (London: Longmans, 1965),
p. 69, gives the same example in more detail:

head  of a person          tête
      of a bed             chevet
      of a coin            face
      of a cane            pomme
      of a match           bout
      of a table           haut bout
      of an organisation   directeur
      on beer              mousse
      title on a page      rubrique

The 5,000,000 word corpus of the AHWFB contains a number of nonsense
words, that is, sound combinations (or word-forms) that would be acceptable
according to the canonical shape limitations of English, but that do not
happen to have a concept-meaning assigned to them: billitch, binning, trendly,
brosket, clob, crobble, floffle, frobish, froon, frumious, froms,
gloobed, gorbed, gribble, grop, grumple, motch, nugful, quinking.

The AHWFB gives two listings for each word type in its 5,000,000
word corpus. The frequency (F) listing is the simple record of how many
times a specific word-form has occurred in the corpus, from 'the'
with an F of 373,123 to the 35,079 word-types which occurred only one
time (hapax legomenon). The absolute frequency notation (F) has been
adjusted for even distribution throughout all 17 text specialties,
however, by the calculation of a distribution value (D). The result
is a second frequency indication, a theoretical but statistically
plausible frequency-per-million value (FPM). Thus, the form 'but'
has a fairly proportional distribution throughout the 17 text specialties,
and has a FPM value of 73,122.8. The form 'cowherd' has an F of 5,
but is limited to only one specialty and has a FPM value of 0.2284.

The less frequent FPM values are usually calculated to four
places after the decimal point, and the FPM represents the statistical
probability of encountering a specific word-form in a corpus of 1,000,000
words. The FPM value has an additional advantage of permitting comparison
with the many word frequency studies which have been done on corpora of
1,000,000 running words.
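
For comparison, a plain per-million scaling of a raw count can be written in one line. This sketch is deliberately NOT the AHWFB calculation: it ignores the distribution value (D) described above, the function name is an assumption, and the corpus size is taken as the round figure of 5,000,000 given in the text.

```python
def per_million(count, corpus_size):
    """Scale a raw count to occurrences per 1,000,000 running words (no dispersion adjustment)."""
    return count * 1_000_000 / corpus_size


CORPUS_SIZE = 5_000_000  # round figure from the text

naive_cowherd = per_million(5, CORPUS_SIZE)
print(naive_cowherd)
```

For 'cowherd' (F of 5) the plain scaling gives 1.0 per million, well above the published FPM of 0.2284; the gap reflects the downward adjustment applied to a form confined to a single text specialty.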

D. A. Wilkins, Linguistics in Language Teaching (Cambridge,
Mass.: The MIT Press, 1972), p. 119.

The source dictionaries are: Edward L. Thorndike, Teacher's
Word Book of 20,000 Words (New York: Columbia University Teachers
College, 1932), which uses a 9,565,000 word corpus; George E.
Vander Beke, French Word Book (New York: Macmillan, 1929), a listing
of 6,067 word-types from 1,000,000 running words; Milton A. Buchanan,
Graded Spanish Word Book (Toronto: Univ. of Toronto Press, 1927), a
listing of 6,702 word-types from 1,200,000 running words; and Friedrich
W. Kaeding, Häufigkeitswörterbuch der deutschen Sprache (Berlin:
E. S. Mittler und Sohn, 1898), a listing of 79,716 word-types from
10,910,777 running words.

The 500 2,000-word texts that make up the BUC are available
on computer tape. The count exists in published form: Henry Kucera
and W. Nelson Francis, Computational Analysis of Present-Day American
English (Providence: Brown Univ. Press, 1967).

The American Heritage Intermediate Corpus count has been
published as The American Heritage Word Frequency Book (op. cit.).

John Lyons, Introduction to Theoretical Linguistics (Cambridge:
Cambridge Univ. Press, 1968), pp. 447-48. See also Lyons, Structural Semantics
(Oxford: Blackwell, 1963); Stephen Ullmann, The Principles of Semantics
(Oxford: Blackwell, 1957), pp. 108-09; and Ullmann, Semantics: An Introduction
to the Science of Meaning (Oxford: Blackwell, 1962), for a fuller treatment
of this idea.






Table 1 

The headword for each category is given in one or two main words. In this
overview the headword does not give a complete or detailed idea of every
word in that particular category. The number of nouns in each category is
indicated in order to give an approximate idea of the relative size of each
category. It must also be remembered that these numbers are subject to
change, since words are still being reshuffled among categories.



Category 1, 139 nouns: world; the country; mountain; water
Category 2, 35 nouns: agriculture; crops
Category 3, 63 nouns: city; store, shop; castle, palace
Category 4, 29 nouns: weather
Category 5, 70 nouns: element, mineral; metal; wood, lumber; element; color
Category 6, 65 nouns: animal (domestic); animal (wild)
Category 7, 38 nouns: bird
Category 8, 10 nouns: fish
Category 9, 18 nouns: insect
Category 10, 79 words: tree; plant; flower
Category 11, 49 words: time; day; calendar
Category 12, 69 words: profession, occupation
Category 13, 71 words: administration; management; technique
Category 14, 107 words: person; friend; fool, idiot
Category 15, 45 words: family
Category 16, 33 words: life
Category 17, 138 words: food, meal; bread; meat; vegetable; fruit; dessert; drink
Category 18, 138 words: dwelling, residence; house; living room; dining room; kitchen; bedroom; bathroom
Category 19, 82 words: machine; tool, implement; box, container
Category 20, 109 words: clothes; suit; dress; cloth; sewing; jewelry; costume
Category 21, 109 words: body; head; trunk; limbs; organ
Category 22, 51 words: sickness; hospital
Category 23, 30 words: luck, chance; success; failure; misfortune
Category 24, 40 words: mind, intellect
Category 25, 244 words: feeling, sense; mood (happiness, anger); love; care, worry; sorrow; honor, respect; hate, hatred
Category 26, 37 words: education; school
Category 27, 24 words: science
Category 28, 104 words: culture; painting; sculpture; theater; music; musical instrument
Category 29, 25 words: literature
Category 30, 45 words: language
Category 31, 122 words: communication; conversation; truth, lie; answer; discussion; sign, symbol
Category 32, 83 words: writing; book; newspaper; letter
Category 33, 142 words: religion; church; worship, rel. service; clergy; marriage; funeral; superstition
Category 34, 33 words: company, society; way of life, habit, custom
Category 35, 53 words: club, association; hospitality; smile, laugh; gift, present
Category 36, 61 words: hobby; toy; sport; riding; hunting; cards; chess
Category 37, 97 words: trip, journey; holiday, vacation; transportation; automobile; shipping, ship; airplane
Category 38, 175 words: country, nation; government; monarchy (king); election, politics; congress, legislature; revolution, revolt
Category 39, 75 words: law; court; crime; prison
Category 40, 118 words: economy; money; finance; bank
Category 41, 143 words: war; campaign, victory; weapon, arm; Army, Navy, Air Force; officer; military unit; naval ship
Category 42, 153 words: thing, object; matter, thing, affair; manner, way; order, disorder; quality, characteristic; cause, effect
Category 43, 115 words: part, particle; top, bottom, side; point, dot; space, place; direction
Category 44, 55 words: number; quantity; measurement
Category 45, 80 words: act, deed; renewal; motion; activity
Category 46, 35 words: phenomenon; sound, noise; radiance, glow





Table 2 



Sample Topic: 04 WEATHER, 28 words.

Logical Organization              Alphabetical Organization

1.4   weather                     breath of wind
?.0   climate                     breeze
1.9   cold, coldness              climate
3.2   cool, coolness              cold, coldness
1.5   heat, warmth                cool, coolness
3.0   temperature                 dew
6.6   thermometer                 fog
2.0   rain                        frost
2.5   shower (rain)               gust of wind
5.8   rainbow                     hail
2.0   snow                        heat, warmth
7.8   snowflake                   ice
2.4   ice                         lightning
1.4   wind                        mist
5.2   breeze                      rain
6.3   breath of wind              rainbow
5.7   gust of wind                roll of thunder
3.6   thunder                     shower (rain)
[illegible] thunderbolt           snow
4.3   roll of thunder             snowflake
3.3   lightning                   storm
5.3   hail                        temperature
3.1   fog                         thermometer
3.1   mist                        thunder
3.9   dew                         thunderbolt
5.0   frost                       weather
1.6   storm                       whirlwind
6.2   whirlwind                   wind
4.9   hurricane





'hurricane' has been added from the AHWFB. Its FPM is 13, giving it
an equivalent Eaton ranking of 4.9. Some additional weather terms in the
AHWFB were evaluated, but did not merit inclusion because their rankings
fall beyond the first 7,000: 'cyclone' (FPM 2.6203, ranking 10,900) and
'blizzard' (FPM 3.5043, ranking 9,400) were two of the words evaluated.






Table 3 

A list of forms that include two or more different concepts or objects
in their meaning extensions. A semantic link is present in all pairings.
This is only a partial listing of polysemantic word-forms.



FORM            CONCEPTS

apartment       1. an individual dwelling for a family.
                2. a building containing many individual dwellings.

appointment     1. a date, a planned meeting.
                2. a post, job, or position.

balloon         1. small: a child's toy.
                2. large: an airship, a lighter-than-air transport.

bank            1. mound of earth by a river.
                2. an institution for the storage of money.
                3. a row of similar objects (e.g. a bank of batteries).

board           1. wooden plank.
                2. food and drink.

cabinet         1. storage compartment or small closet.
                2. governmental advisory group of ministers.

cardinal        1. a church cleric or dignitary.
                2. a red bird.

cell            1. a basic organizational unit (e.g. of a political group).
                2. the smallest structure of a biological organism.
                3. a room in a prison.

chest           1. part of the body.
                2. a box or trunk.

class           1. a grouping or collection of items or people.
                2. a group of students.

coal            1. a mineral used for burning.
                2. any glowing ember.

conception      1. an idea, concept, or image.
                2. the beginning of life for an embryo.



correspondence  1. similarity, matching of two objects.
                2. an exchange of letters.

date            1. a numbered day on the calendar.
                2. a meeting or appointment.

demonstration   1. showing or explaining something.
                2. a public protest or manifestation.

drop            1. the act of falling.
                2. a small mass of liquid.

duty            1. obligation.
                2. a customs fee.

engagement      1. a meeting or appointment.
                2. a betrothal.

faculty         1. a specific mental ability.
                2. the teaching staff of a school or university.

fall            1. the act of dropping.
                2. the season (autumn).

gas             1. vapor.
                2. gasoline.

horn            1. of an animal.
                2. of a car.
                3. a musical instrument.

industry        1. manufacturing.
                2. diligence; exhibiting perseverance.

lock            1. a key-operated device.
                2. a water-regulating gate in a canal.
                3. a bend or twist of hair; a curl or ringlet.

lord            1. the Lord and Savior.
                2. a titled nobleman.

memory          1. the ability or faculty of recall.
                2. a specific remembrance of an event or happening.

office          1. a place of business.
                2. a position, assignment, post, job, or working capacity.



operation       1. any activity or undertaking.
                2. a surgical procedure.

order           1. a fixed plan, system, or arrangement.
                2. a state of peace and serenity.
                3. an authoritative command.
                4. a request to make or supply something.

painter         1. an artist.
                2. a person who paints houses.

peak, summit    1. the top of a mountain.
                2. any high point.

period          1. a length of time.
                2. a dot or point.

pin             1. a sharp, pointed object similar to a needle.
                2. a piece of jewelry.

pipe            1. a metal tube or conduit.
                2. a device for smoking tobacco.

plant           1. vegetation that is growing.
                2. a factory or industrial installation.

powder          1. a fine, dust-like substance.
                2. cosmetic face-powder.
                3. gunpowder.

pupil           1. a schoolboy or schoolgirl.
                2. the center of the eye.

review          1. looking over or examining something.
                2. a magazine.
                3. a military parade.

ring            1. a circle.
                2. a finger decoration in the form of a circle. (KER 2)
                3. a sound made by a bell. (SKER 3 and KER 2)

root            1. underground tendrils that form the base of a tree.
                2. any source or foundation.
                3. the stem of a word.

shade           1. a shadow protected from the sun.
                2. a hue or tint of color, of meaning.



sheet           1. of paper.
                2. of bed.

shift           1. any change or alteration.
                2. a group of workers.

solution        1. resolving a problem.
                2. dissolving something in a liquid.

spring          1. a jump or leap upwards.
                2. a coiled length of metal.
                3. a season of the year.
                4. a source of water in the earth.

square          1. a four-sided equilateral geometrical figure.
                2. an area in a city bounded by streets on four sides.

staff           1. rod, stick, or pole.
                2. a group of officers [...] assistants.

stream          1. a brook or small river.
                2. the flow of people or things.

stroke          1. a blow or strike.
                2. a blood clot in the brain.

tank            1. a large container for liquid.
                2. an armored military vehicle carrying a cannon.

turn            1. a complete revolution or turn about an axis.
                2. a change or reversal of a course or direction (to right, left).
                3. a change in circumstances, policy, health, events, etc.

union           1. togetherness; the fact of being joined.
                2. any unified or federated group.
                3. a labor group or workers' confederation.

want            1. a lack, scarcity, or shortage.
                2. a wish or desire.

watch           1. visual surveillance or guarding.
                2. a timepiece worn on the wrist.

wave            1. an up and down signal with the hand.
                2. the undulation of water.
                3. a regular formation of hair.



way             1. road, path.
                2. method, manner.

work            1. labor; a job, a task.
                2. a work of art, an oeuvre.

youth           1. a young person.
                2. a time of life.

Note: The two or three concepts listed for each form are by no means the
complete range of that form. Only a few of the more representative concepts
are listed, namely those concepts whose foreign language translations have a
high frequency. Words like 'order' and 'turn' have a great variety of dictionary
definitions, and it is also difficult to draw a boundary between many of
these meaning extensions.

This short sample of homonymic pairs shows concepts that are not semantically
related, but that share the word-form as a result of linguistic
coincidence.

FORM            DISSIMILAR MEANINGS

ball            1. round object used as a toy or in sport.
                2. a dance or dancing party.

bill            1. a statement of money owed; a note of paper currency.
                2. the beak of a bird; the brim of a cap.

cape            1. a headland or promontory over water.
                2. an article of outer clothing.

match           1. a similarity or correspondence.
                2. a chemically-tipped piece of wood used to start a fire.

pawn            1. a piece of a chess set.
                2. an item pledged to guarantee a loan.

post            1. a stake or pole.
                2. [illegible]

race            1. an ethnic classification.
                2. a competitive speed run.

rest            1. that part which is left over or remains from something.
                2. relaxation, repose, or refreshment.

swallow         1. a gulping motion in the throat.
                2. a bird.

temple          1. a place of worship.
                2. the side of the forehead.



It is inevitable that such coincidences will occur, since language has
a relatively small inventory of words (ca. 30,000 to 50,000) to express
millions of different concepts and objects of reality. Given the fact
that the average length of a word is between two and seven sounds (or
letters), and the fact that most languages have a sound inventory of
30 to 45 sounds, it is not surprising that word-forms have a great
burden in concept-expression and concept-representation.



Table 4

In examining this list it must be remembered that no two words can
substitute for one another in all environments and style lines. This
list is not complete. It is arranged according to TOPICS and lists only
noun examples in the topical vocabulary checklist.

sod - turf                          advice - counsel
dock - pier                         help - aid, assistance
coast - shore                       obstacle - barrier
                                    order - command

3. CITY
store - shop                        14. PERSON
factory - plant                     foreigner - alien

heat - warmth                       gravy - sauce
fog - mist

5. ELEMENT, MINERAL                 cellar - basement
china - porcelain                   foyer - entrance hall, vestibule
hue - tint, shade                   hall - corridor
                                    pillar - column

6. ANIMAL                           couch - sofa, divan
donkey - ass

7. BIRD                             motor - engine
fowl - poultry                      shovel - spade
feather - plume                     string - cord
pigeon - dove                       plank - board
                                    bag - sack

10. PLANT                           pail - bucket
shoot - sprout

parAr. - trcrarer:. 

                                    21. BODY
12. PROFESSION                      stomach -
profession - occupation
office - bureau                     22. SICKNESS
personnel - staff                   ache - pain
doctor - physician                  pill - tablet
lawyer - attorney






Table 4 (contd.)



23. LUCK, CHANCE
probability -
misfortune -
agony - anguish

concept - notion
understanding -
ability - aptitude

(feeling - sensation
(feeling - emotion
bliss - rapture
longing - yearning
(gift - talent
(gift - present
anger - wrath
fury - rage
zeal - fervor, ardor
pity -
(comfort - consolation
(comfort -
(a worry - a care
(worry - anxiety
faith - belief
trust - confidence
respect - esteem
boredom -
laziness -
courage -

26. EDUCATION
test -
mistake -
answer -

32. WRITING
author -

feast - banquet
ball - dance
donation - contribution

case -

34. COMMAND
pardon -

36. SPORT
obstacle - hurdle

trip - journey
traveller - tourist
vacation - holiday
automobile - car

border - frontier
kingdom - empire
monarch - sovereign
earl - count, baron

LAW
law - statute
fine - penalty
prisoner - convict
prison - jail
thief - robber
will - testament

company - firm, corporation
trade - commerce
value - worth
salary - wage, earnings

attack - assault
spoils - booty
exile - banishment
refuge - haven, asylum

(material - matter
(material - cloth
thing - object
event - incident
- behavior


Table 4 (contd.)

(state - condition                  43. PART
(state - nation                     bit - piece
kind - sort                         end - conclusion, finish
contrary - opposite                 start - beginning
defect - flaw                       middle - center
evil - wickedness                   rest - remainder
safety - security
goal - aim, purpose                 44. NUMBER, QUANTITY
                                    amount - quantity
                                    shortage - lack, want

                                    45. ACT, DEED
                                    act - deed
                                    turn - revolution, revolving

These informal concept-pairings have been established to obtain accurate
frequency numbers for the total concept. There is no implication that the
two members of any synonym pair are identical in meaning. The bracketed
pairs indicate two synonym pairs together with a concept difference, and
the synonym pairs are necessary to establish the polysemantic concept
difference.

This list is limited to nouns. Verbs and adjectives also provide numerous
examples of one concept being expressed by two word-forms.

Even though many synonym pairs are rather loose (kingdom - empire, spear -
lance, traveller - tourist), it is felt that a better frequency indication
is obtained for the total concept rather than for each of the two individual
word-forms.
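
The idea of a frequency for the total concept can be sketched as a simple sum over the word-forms of a synonym pair. The fragment below is a minimal illustration under stated assumptions: the pairs are taken from Table 4, but the counts in FORM_COUNTS are hypothetical figures invented for the example, and the function names concept_frequencies and rank_concepts are not from the original study.

```python
# Synonym pairs copied from Table 4; each tuple is treated as one concept.
SYNONYM_PAIRS = [
    ("doctor", "physician"),
    ("prison", "jail"),
    ("kingdom", "empire"),
]

# HYPOTHETICAL word-form counts, invented for illustration only.
# They are not taken from any frequency dictionary.
FORM_COUNTS = {
    "doctor": 12, "physician": 6,
    "prison": 42, "jail": 21,
    "kingdom": 50, "empire": 60,
}


def concept_frequencies(form_counts, pairs):
    """Add up the counts of the word-forms that express each concept."""
    return {
        pair: sum(form_counts.get(form, 0) for form in pair)
        for pair in pairs
    }


def rank_concepts(totals):
    """Order concepts from most to least frequent by combined count."""
    return sorted(totals, key=totals.get, reverse=True)


totals = concept_frequencies(FORM_COUNTS, SYNONYM_PAIRS)
ranking = rank_concepts(totals)
print(totals)
print(ranking)
```

Summing is the simplest possible pooling rule. It treats the two members of a loose pair as fully interchangeable, which the text above explicitly does not claim, so the combined figure is best read as an indication for the concept and not as an exact count.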



